Ch2: Small worlds

Jesse Brunner

What do we mean by small worlds?

Simply this:

  • we are working with models
  • models are simplification of reality
    • they are incomplete
    • they are myopic

Probability as counting

  • For each possible explanation of the data (i.e., possible value of \(p\))
  • Count all of the ways the data could happen (i.e., plausibility of data | \(p_{proposed}\))
  • Explanations with more ways to produce them are more plausible

Probability as counting

Most formulas for probabilities are just short-cuts for counting (or integrating). E.g., the binomial distribution:

\[\begin{align} \Pr(k \text{ of } n | p) &= \binom{n}{k}p^k(1-p)^{n-k} \\ &= \frac{n!}{k!(n-k)!} p^k(1-p)^{n-k} \end{align}\]

Probability as counting

Most formulas for probabilities are just short-cuts for counting (or integrating). E.g., the binomial distribution:

\[\begin{align} \Pr(k \text{ of } n | p) &= \binom{n}{k}p^k(1-p)^{n-k} \\ &= \frac{n!}{k!(n-k)!} p^k(1-p)^{n-k} \end{align}\]

Note: This formula provides us a short-cut for calculating the probability of observing \(k\) of \(n\) “successes” given a particular value of the parameter \(p\).

Don’t fret! Just think of constraints!

We choose statistical distributions because of

  1. Theory (someone else figured it out for you!)

  2. Constraints (stuff you know)

    • Discrete vs. continuous?
    • Bounded? Positive?
    • A/symmetric?

We will learn along the way

Bayes theorem

\[\begin{align} \Pr(p_i | k) &= \frac{\text{Probability of data} | p_i\times \text{Prior probability of }p_i}{\text{Probability of the data overall}} \end{align}\]

But let’s not get caught up in the math! (Watch the 3B1B video… it’s better!)

Bayes theorem

A Bayesian modeling perspective

  • Generative models are Bayesian models
    • start from there… takes longer, but more powerful
  • Workflow is important
  • Variables: data are observed variables, parameters are unobserved variables
  • Distributions: likelihoods and priors are both distributions
  • Indexing: there is no random or fixed effects, just indexed variables

Parts of the model

  • Data (e.g., water or land in globe-flipping example)
  • Description of the distribution of the data | parameters (NB: likelihood)
    • \(W \sim \text{binomial}(n, p)\)
    • can also involve description of how parameters vary with data
      • (e.g., \(\text{logit}(p_i) = \alpha + \beta \times x_i\))
  • Description of the prior probability of the parameters
    • \(p \sim \text{beta}(1, 2)\)
  • Some engine to do the calculations
    • integration (nope) & conjugate priors (sometimes you get lucky!)
    • grid approximation
    • quadratic approximation (quap())
    • MCMC, WinBugs, JAGS, Stan (ulam())